Search: All records where Creators/Authors contains "Blocki, Jeremiah"

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).

Some links on this page may take you to non-federal websites. Their policies may differ from those of this site.

  1. Free, publicly-accessible full text available September 9, 2024
  2. We revisit computationally relaxed locally decodable codes (crLDCs) (Blocki et al., Trans. Inf. Theory ’21) and give two new constructions. Our first construction is a Hamming crLDC that is conceptually simpler than prior constructions, leveraging digital signature schemes and an appropriately chosen Hamming code. Our second construction is an extension of our Hamming crLDC to handle insertion-deletion (InsDel) errors, yielding an InsDel crLDC. This extension crucially relies on the noisy binary search techniques of Block et al. (FSTTCS ’20) to handle InsDel errors. Both crLDC constructions have binary codeword alphabets, are resilient to a constant fraction of Hamming and InsDel errors, respectively, and under suitable parameter choices have poly-logarithmic locality and encoding length linear in the message length and polynomial in the security parameter. These parameters compare favorably to prior constructions in the poly-logarithmic locality regime. 
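    A toy sketch of the general "sign the blocks, then encode" flavor described above; it is not the paper's construction. HMAC stands in for a digital signature (a real crLDC would use a public-key signature so the decoder only needs the verification key), a simple repetition code stands in for the appropriately chosen Hamming code, and None plays the role of the relaxed decoder's ⊥ output.
    ```python
    import hmac, hashlib

    KEY = b"demo-key"   # stand-in: HMAC is symmetric, unlike a real signature scheme
    REPS = 5            # repetition factor, stand-in for the Hamming code
    BLOCK = 4           # message-block size in bytes

    def tag(block):
        return hmac.new(KEY, block, hashlib.sha256).digest()[:8]

    def encode(msg):
        blocks = [msg[i:i + BLOCK] for i in range(0, len(msg), BLOCK)]
        # each codeword symbol is (block || tag), repeated REPS times
        return [b + tag(b) for b in blocks for _ in range(REPS)]

    def local_decode(codeword, i):
        """Read only the REPS symbols for block i; return the block, or None (⊥)."""
        symbols = codeword[i * REPS:(i + 1) * REPS]
        candidate = max(set(symbols), key=symbols.count)   # majority vote over the queried symbols
        block, t = candidate[:-8], candidate[-8:]
        return block if hmac.compare_digest(t, tag(block)) else None

    cw = encode(b"hello world!")
    cw[1] = b"garbage-symbol!!"        # adversarial corruption of one queried symbol
    print(local_decode(cw, 0))         # b'hell' while the corruption is outnumbered, None otherwise
    ```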
    Free, publicly-accessible full text available June 25, 2024
  3. Proper communication is key to the adoption and implementation of differential privacy (DP). In this work, we designed explanatory illustrations of three DP models (Central DP, Local DP, Shuffler DP) to help laypeople conceptualize how random noise is added to protect individuals’ privacy and preserve group utility. Following a pilot survey and an interview, we conducted an online experiment (N = 300) exploring participants’ comprehension, privacy and utility perception, and data-sharing decisions across the three DP models. We obtained empirical evidence showing participants’ acceptance of the Shuffler DP model for data privacy protection. We discuss the implications of our findings.
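    A minimal sketch, with made-up parameter values, of how random noise enters two of the three models the study illustrates: in Central DP a trusted curator adds Laplace noise to the aggregate, while in Local DP each person randomizes their own answer (randomized response) before it leaves their device. Shuffler DP additionally shuffles the locally randomized reports before analysis; only the noise-adding steps are shown here.
    ```python
    import math
    import random

    def laplace(scale):
        # the difference of two exponentials is a Laplace(0, scale) sample
        return random.expovariate(1 / scale) - random.expovariate(1 / scale)

    def central_dp_count(bits, eps):
        """Trusted curator: exact count plus Laplace(1/eps) noise (counting queries have sensitivity 1)."""
        return sum(bits) + laplace(1 / eps)

    def local_dp_report(bit, eps):
        """Each user keeps their true bit with probability e^eps / (e^eps + 1), else flips it."""
        keep = math.exp(eps) / (math.exp(eps) + 1)
        return bit if random.random() < keep else 1 - bit

    def local_dp_count(reports, eps):
        """Debias the sum of noisy per-user reports to estimate the true count."""
        p = math.exp(eps) / (math.exp(eps) + 1)
        return (sum(reports) - len(reports) * (1 - p)) / (2 * p - 1)

    data = [random.randint(0, 1) for _ in range(300)]          # N = 300, as in the experiment
    reports = [local_dp_report(b, eps=1.0) for b in data]
    print(sum(data), round(central_dp_count(data, eps=1.0), 1), round(local_dp_count(reports, eps=1.0), 1))
    ```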
    Free, publicly-accessible full text available September 1, 2024
  4. Free, publicly-accessible full text available July 10, 2024
  5. A central challenge in password security is to characterize the attacker's guessing curve, i.e., the probability that the attacker will crack a random user's password within the first G guesses. A key challenge is that the guessing curve depends on the attacker's guessing strategy and the distribution of user passwords, both of which are unknown to us. In this work we aim to follow Kerckhoffs's principle and analyze the performance of an optimal attacker who knows the password distribution. Let λ_G denote the probability that such an attacker can crack a random user's password within G guesses. We develop several statistically rigorous techniques to upper and lower bound λ_G given N independent samples from the unknown password distribution P. We show that our upper/lower bounds on λ_G hold with high confidence, and we apply our techniques to analyze eight large password datasets. Our empirical analysis shows that even state-of-the-art password cracking models are often significantly less guess efficient than an attacker who can optimize its attack based on its (partial) knowledge of the password distribution. We also apply our statistical tools to re-examine different models of the password distribution, i.e., the empirical password distribution and Zipf's Law. We find that the empirical distribution closely matches our upper/lower bounds on λ_G when the guessing number G is not too large, i.e., G << N. However, for larger values of G our empirical analysis rigorously demonstrates that the empirical distribution (resp. Zipf's Law) overestimates the attacker's success rate. We apply our statistical techniques to upper/lower bound the effectiveness of password throttling mechanisms (key-stretching), which are used to reduce the number of attacker guesses G. Finally, if we are willing to make an additional assumption about the way users respond to password restrictions, we can use our statistical techniques to evaluate the effectiveness of various password composition policies which restrict the passwords that users may select.
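    A minimal sketch, not the paper's statistical machinery, of the quantity being bounded: given N sampled passwords, the plug-in estimate of λ_G is the fraction of mass covered by the G most frequent samples. The paper's point is that this estimate needs rigorous upper/lower confidence bounds, and that it becomes overly optimistic once G is no longer much smaller than N.
    ```python
    from collections import Counter

    def empirical_lambda(samples, G):
        counts = Counter(samples)
        top_g = counts.most_common(G)      # the optimal attacker's first G guesses under the empirical distribution
        return sum(c for _, c in top_g) / len(samples)

    pwds = ["123456", "password", "123456", "iloveyou", "123456", "qwerty", "password"]
    print(empirical_lambda(pwds, G=2))     # probability mass cracked within G = 2 guesses
    ```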
  6. The data management of large companies often prioritizes more recent data, as a source of higher-accuracy predictions than outdated data. For example, the Facebook data policy retains user search histories for a bounded number of months, and the Google data retention policy states that browser information may be stored for at most a fixed number of months. These policies are captured by the sliding window model, in which only the most recent statistics form the underlying dataset. In this paper, we consider the problem of privately releasing the L2-heavy hitters in the sliding window model, which include Lp-heavy hitters for p <= 2 and in some sense are the strongest possible guarantees that can be achieved using polylogarithmic space, but cannot be handled by existing techniques due to the sub-additivity of the L2 norm. Moreover, existing non-private sliding window algorithms use the smooth histogram framework, which has high sensitivity. To overcome these barriers, we introduce the first differentially private algorithm for L2-heavy hitters in the sliding window model by initiating a number of L2-heavy hitter algorithms across the stream with significantly lower thresholds. Similarly, we augment the algorithms with an approximate frequency tracking algorithm with significantly higher accuracy. We then use smooth sensitivity and statistical distance arguments to show that we can add noise proportional to an estimation of the norm. To the best of our knowledge, our techniques are the first to privately release statistics related to a sub-additive function in the sliding window model, and they may be of independent interest for future differentially private algorithm design in the sliding window model.
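    A toy illustration only, far simpler than the paper's algorithm: it keeps exact counts over a sliding window and releases items whose Laplace-noised count exceeds a threshold, purely to show where the noise and the release threshold enter. The actual construction instead runs space-efficient L2-heavy-hitter sketches with lowered thresholds and relies on smooth-sensitivity arguments; the version below uses linear space and plain L1 counts, and all parameter values are made up.
    ```python
    import random
    from collections import Counter, deque

    def laplace(scale):
        # the difference of two exponentials is a Laplace(0, scale) sample
        return random.expovariate(1 / scale) - random.expovariate(1 / scale)

    def private_window_heavy_hitters(stream, window, eps, threshold):
        buf = deque(maxlen=window)         # only the most recent `window` items form the dataset
        for x in stream:
            buf.append(x)
        counts = Counter(buf)
        # release an item only if its noisy windowed count clears the threshold
        return [item for item, c in counts.items() if c + laplace(1 / eps) >= threshold]

    stream = ["a"] * 40 + ["b"] * 5 + ["a", "c"] * 10
    print(private_window_heavy_hitters(stream, window=50, eps=1.0, threshold=10))
    ```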
  7. Kiltz, E. (Ed.)
    The classical (parallel) black pebbling game is a useful abstraction which allows us to analyze the resources (space, space-time, cumulative space) necessary to evaluate a function f with a static data-dependency graph G. Of particular interest in the field of cryptography are data-independent memory-hard functions f_{G,H} which are defined by a directed acyclic graph (DAG) G and a cryptographic hash function H. The pebbling complexity of the graph G characterizes the amortized cost of evaluating f_{G,H} multiple times as well as the total cost to run a brute-force preimage attack over a fixed domain X, i.e., given y ∈ {0,1}* find x ∈ X such that f_{G,H}(x) = y. While a classical attacker will need to evaluate the function f_{G,H} at least m = |X| times, a quantum attacker running Grover's algorithm only requires O(√m) black-box calls to a quantum circuit C_{G,H} evaluating the function f_{G,H}. Thus, to analyze the cost of a quantum attack it is crucial to understand the space-time cost (equivalently, width times depth) of the quantum circuit C_{G,H}. We first observe that a legal black pebbling strategy for the graph G does not necessarily imply the existence of a quantum circuit with comparable complexity, in contrast to the classical setting where any efficient pebbling strategy for G corresponds to an algorithm with comparable complexity for evaluating f_{G,H}. Motivated by this observation we introduce a new parallel reversible pebbling game which captures additional restrictions imposed by the No-Deletion Theorem in Quantum Computing. We apply our new reversible pebbling game to analyze the reversible space-time complexity of several important graphs: line graphs, Argon2i-A, Argon2i-B, and DRSample. Specifically, (1) we show that a line graph of size N has reversible space-time complexity at most O(N^{1 + 2/√(log N)}). (2) We show that any (e, d)-reducible DAG has reversible space-time complexity at most O(Ne + dN2^d). In particular, this implies that the reversible space-time complexities of Argon2i-A and Argon2i-B are at most O(N^2 log log N / √(log N)) and O(N^2 / (log N)^{1/3}), respectively. (3) We show that the reversible space-time complexity of DRSample is at most O((N^2 log log N) / log N). We also study the cumulative pebbling cost of reversible pebblings, extending a (non-reversible) pebbling attack of Alwen and Blocki on depth-reducible graphs.
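    A small sketch (our own illustration, not code from the paper) of the reversible move rule discussed above: a pebble may be placed on, or removed from, a vertex only while all of its parents are pebbled, which is the pebbling analogue of the restriction imposed by the No-Deletion Theorem. `parents[v]` lists v's predecessors in the DAG G, and a pebbling is given as a list of pebble configurations, one per (parallel) round; the tiny line-graph example is ours.
    ```python
    def is_legal_reversible_pebbling(parents, configs):
        for before, after in zip(configs, configs[1:]):
            changed = before ^ after                        # pebbles placed or removed this round
            if any(not set(parents[v]) <= before for v in changed):
                return False                                # touched a vertex whose parents were not all pebbled
        return True

    def space_time_cost(configs):
        return len(configs) * max(len(c) for c in configs)  # rounds times maximum space

    # Line graph 0 -> 1 -> 2 -> 3: pebble forward, then reversibly clean up behind the sink.
    parents = {0: [], 1: [0], 2: [1], 3: [2]}
    steps = [set(), {0}, {0, 1}, {0, 1, 2}, {0, 1, 2, 3}, {0, 1, 3}, {0, 3}, {3}]
    print(is_legal_reversible_pebbling(parents, steps), space_time_cost(steps))
    ```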
  8. Dodis, Y. (Ed.)
    Memory-hard functions (MHFs) are a useful cryptographic primitive which can be used to design egalitarian proof-of-work puzzles and to protect low-entropy secrets like passwords against brute-force attackers. Intuitively, a memory-hard function is a function whose evaluation costs are dominated by memory costs even if the attacker uses specialized hardware (FPGAs/ASICs), and several cost metrics have been proposed to quantify this intuition. For example, space-time cost looks at the product of running time and the maximum space usage over the entire execution of an algorithm. Alwen and Serbinenko (STOC 2015) observed that the space-time cost of evaluating a function multiple times may not scale linearly in the number of instances being evaluated, and introduced the stricter requirement that a memory-hard function has high cumulative memory complexity (CMC) to ensure that an attacker's amortized space-time costs remain large even if the attacker evaluates the function on multiple different inputs in parallel. Alwen et al. (EUROCRYPT 2018) observed that the notion of CMC still gives the attacker undesirable flexibility in selecting space-time tradeoffs, e.g., while the MHF Scrypt has maximal CMC Ω(N^2), an attacker could evaluate the function with constant O(1) memory in time O(N^2). Alwen et al. introduced an even stricter notion of sustained space complexity and designed an MHF with sustained space complexity s = Ω(N/log N) over t = Ω(N) steps, i.e., any algorithm evaluating the function in the parallel random oracle model must have at least t = Ω(N) steps during which the memory usage is at least Ω(N/log N). In this work, we use dynamic pebbling games and dynamic graphs to explore tradeoffs between sustained space complexity and cumulative memory complexity for data-dependent memory-hard functions such as Argon2id and Scrypt. We design our own dynamic graph (dMHF) with the property that any dynamic pebbling strategy either (1) has Ω(N) rounds with Ω(N) space, or (2) has CMC Ω(N^{3−ϵ}), substantially larger than N^2. For Argon2id we show that any dynamic pebbling strategy either (1) has Ω(N) rounds with Ω(N^{1−ϵ}) space, or (2) has CMC ω(N^2). We also present a dynamic version of DRSample (Alwen et al. 2017) for which any dynamic pebbling strategy either (1) has Ω(N) rounds with Ω(N/log N) space, or (2) has CMC Ω(N^3/log N).
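    A minimal sketch of the three cost metrics compared above, computed for a given pebbling transcript, i.e., a list of pebble configurations, one per round. The function names and the tiny example transcript are ours.
    ```python
    def space_time(configs):
        return len(configs) * max(len(c) for c in configs)

    def cumulative_memory_complexity(configs):
        # CMC sums the memory used in every round, so amortizing over many parallel
        # evaluations cannot push the per-instance cost below CMC per instance.
        return sum(len(c) for c in configs)

    def sustained_space(configs, s):
        # number of rounds in which at least s pebbles are kept in memory
        return sum(1 for c in configs if len(c) >= s)

    transcript = [{0}, {0, 1}, {0, 1, 2}, {2, 3}, {3}]
    print(space_time(transcript),                    # 5 rounds * 3 max pebbles = 15
          cumulative_memory_complexity(transcript),  # 1 + 2 + 3 + 2 + 1 = 9
          sustained_space(transcript, s=2))          # 3 rounds hold >= 2 pebbles
    ```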
  9. Ta-Shma, Amnon (Ed.)
    Locally Decodable Codes (LDCs) are error-correcting codes C: Σ^n → Σ^m, encoding messages in Σ^n to codewords in Σ^m, with super-fast decoding algorithms. They are important mathematical objects in many areas of theoretical computer science, yet the best constructions so far have codeword length m that is super-polynomial in n for codes with constant query complexity and constant alphabet size. In a very surprising result, Ben-Sasson, Goldreich, Harsha, Sudan, and Vadhan (SICOMP 2006) showed how to construct a relaxed version of LDCs (RLDCs) with constant query complexity and almost linear codeword length over the binary alphabet, and used them to obtain significantly improved constructions of Probabilistically Checkable Proofs. In this work, we study RLDCs in the standard Hamming-error setting, and introduce their variants in the insertion and deletion (Insdel) error setting. Standard LDCs for Insdel errors were first studied by Ostrovsky and Paskin-Cherniavsky (Information Theoretic Security, 2015), and are further motivated by recent advances in DNA random-access bio-technologies. Our first result is an exponential lower bound on the length of Hamming RLDCs making 2 queries (even adaptively), over the binary alphabet. This answers a question explicitly raised by Gur and Lachish (SICOMP 2021) and is the first exponential lower bound for RLDCs. Combined with the results of Ben-Sasson et al., our result exhibits a "phase-transition"-type behavior in the codeword length at some constant query complexity. We achieve these lower bounds via a transformation of RLDCs to standard Hamming LDCs, using a careful analysis of restrictions of message bits that fix codeword bits. We further define two variants of RLDCs in the Insdel-error setting, a weak and a strong version. On the one hand, we construct weak Insdel RLDCs with almost linear codeword length and constant query complexity, matching the parameters of the Hamming variants. On the other hand, we prove exponential lower bounds for strong Insdel RLDCs. These results demonstrate that, while these variants are equivalent in the Hamming setting, they are significantly different in the Insdel setting. Our results also prove a strict separation between Hamming RLDCs and Insdel RLDCs.
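    For readers unfamiliar with local decoding, here is the classic, well-known example (the Hadamard code, not a construction from the paper) of a 2-query LDC with exponential codeword length, which is the query regime addressed by the lower bound above: each message bit is recovered from only two codeword positions.
    ```python
    import random

    def hadamard_encode(x):
        """x is a list of n bits; the codeword has 2^n bits, one inner product <a, x> mod 2 per mask a."""
        n = len(x)
        return [sum(((a >> j) & 1) * x[j] for j in range(n)) % 2 for a in range(2 ** n)]

    def local_decode(codeword, i, n):
        """Recover bit i of the message with only 2 queries to the (possibly corrupted) codeword."""
        a = random.randrange(2 ** n)
        b = a ^ (1 << i)                        # flip coordinate i of the random mask
        return (codeword[a] + codeword[b]) % 2  # <a, x> + <a xor e_i, x> = x_i (mod 2)

    msg = [1, 0, 1, 1]
    cw = hadamard_encode(msg)
    cw[3] ^= 1                                  # corrupt one codeword position
    print([local_decode(cw, i, n=len(msg)) for i in range(len(msg))])  # correct with high probability
    ```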
  10. Large-scale online password guessing attacks are widespread and pose a persistent privacy and security threat to users. The common method for mitigating the risk of online cracking is to lock out the user after a fixed number (K) of consecutive incorrect login attempts. Selecting the value of K induces a classic security-usability trade-off. When K is too large, a hacker can (quickly) break into a significant fraction of user accounts, but when K is too low, we will start to annoy honest users by locking them out after a few mistakes. Motivated by the observation that honest user mistakes typically look quite different from an online attacker's password guesses, we introduce DALock, a distribution-aware password lockout mechanism that reduces user annoyance while minimizing user risk. As the name suggests, DALock is designed to be aware of the frequency and popularity of the passwords used in login attempts, whereas standard throttling mechanisms (e.g., K-strikes) are oblivious to the password distribution. In particular, DALock maintains an extra "hit count" in addition to the "strike count" for each user, based on (estimates of) the cumulative probability of all login attempts for that particular account. We empirically evaluate DALock with an extensive battery of simulations using real-world password datasets. In comparison with the traditional K-strikes mechanism, our simulations indicate that DALock offers a superior simulated security/usability trade-off. For example, in one of our simulations, we are able to reduce the success rate of an attacker to 0.05% (compared to 1% for the 3-strikes mechanism) while simultaneously reducing the unwanted lockout rate for accounts that are not under attack to just 0.08% (compared to 4% for the 3-strikes mechanism).
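    A toy rendering of the mechanism the abstract describes, not the paper's implementation: alongside the usual strike count, keep a "hit count" that accumulates the estimated probability of every incorrect password tried against an account, and lock the account when either counter crosses its threshold. The popularity table, thresholds, and the counter-reset-on-success behavior below are illustrative assumptions.
    ```python
    PASSWORD_POPULARITY = {"123456": 0.01, "password": 0.006, "letmein": 0.002}   # made-up estimates

    class DALockAccount:
        def __init__(self, strike_limit=10, hit_limit=0.02):
            self.strikes, self.hits = 0, 0.0
            self.strike_limit, self.hit_limit = strike_limit, hit_limit

        def locked(self):
            return self.strikes >= self.strike_limit or self.hits >= self.hit_limit

        def try_login(self, guess, true_password):
            if self.locked():
                return "locked"
            if guess == true_password:
                self.strikes, self.hits = 0, 0.0     # assumption: a successful login resets both counters
                return "ok"
            self.strikes += 1                        # classic K-strikes bookkeeping
            self.hits += PASSWORD_POPULARITY.get(guess, 1e-6)   # distribution-aware bookkeeping
            return "locked" if self.locked() else "try again"

    acct = DALockAccount()
    for guess in ["123456", "password", "letmein", "123456"]:    # attacker cycling popular passwords
        print(guess, acct.try_login(guess, true_password="correct horse battery staple"))
    ```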